feat(bokeh): implement timeseries-forecast-uncertainty #7402

Merged
MarkusNeusinger merged 5 commits into main from implementation/timeseries-forecast-uncertainty/bokeh on May 19, 2026

Conversation

claude Bot commented May 19, 2026

Summary

  • Regenerated bokeh implementation from quality 92 with targeted improvements
  • Added HoverTool for both historical and forecast lines (showing date, sales, and CI bounds on hover)
  • Removed enclosing box outline (outline_line_color=None) for L-shaped frame matching style guide
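For readers unfamiliar with the two Bokeh features named in the summary, here is a minimal sketch of a `HoverTool` attached to a forecast line and of the outline removal. The data, the column names (`lower_95`, `upper_95`), and the figure size are illustrative assumptions, not the code from this PR.

```python
from datetime import datetime

from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# Hypothetical forecast data; column names are illustrative only.
forecast = ColumnDataSource(data={
    "x": [datetime(2024, 1, 1), datetime(2024, 2, 1), datetime(2024, 3, 1)],
    "y": [130.0, 133.0, 137.0],
    "lower_95": [122.0, 121.0, 120.0],
    "upper_95": [138.0, 145.0, 154.0],
})

p = figure(x_axis_type="datetime", width=800, height=450)

# Drop the enclosing box so only the left and bottom axis lines remain.
p.outline_line_color = None

line = p.line("x", "y", source=forecast, line_width=3, line_dash="dashed")

# Restrict the tool to the forecast line and expose the CI columns in the tooltip.
hover = HoverTool(
    renderers=[line],
    tooltips=[
        ("Date", "@x{%b %Y}"),
        ("Sales", "@y{0.0}"),
        ("95% CI", "@lower_95{0.0} to @upper_95{0.0}"),
    ],
    formatters={"@x": "datetime"},  # format the x column as a date
    mode="vline",                   # trigger on the vertical line under the cursor
)
p.add_tools(hover)
```

A second `HoverTool` bound to the historical line would follow the same pattern with its own `renderers` list.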

Plan

N/A

Test plan

  • Verify plot-light.png shows warm off-white background (#FAF8F1) with green historical line and vermillion dashed forecast
  • Verify plot-dark.png shows near-black background (#1A1A17) with identical data colors
  • Verify title reads timeseries-forecast-uncertainty · python · bokeh · anyplot.ai
  • Verify legend shows all 4 items: Historical Data, Forecast, 80% CI, 95% CI
  • Verify no enclosing box around plot area (only left and bottom axis lines)
  • Verify HoverTool works in HTML renders showing CI bounds

claude Bot and others added 2 commits May 19, 2026 13:18
Regen from quality 92. Addressed:
- Added HoverTool for both historical and forecast lines (with CI bounds)
- Removed enclosing box outline (outline_line_color=None) for L-shaped frame
- Canvas corrected to 3200x1800 (from stale 4800x2700)
- Font sizes aligned with style guide (18pt/14pt/12pt)
- Title updated to include python language token
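As a rough illustration of the canvas and font-size items in this commit message, the settings could look like the sketch below. Which size belongs to which element (18pt title, 14pt axis labels, 12pt tick labels) is taken from the second review further down. The rest is an assumption about how the script is laid out.

```python
from bokeh.plotting import figure

# Canvas size from the commit message; title text from the test plan.
p = figure(
    width=3200,
    height=1800,
    x_axis_type="datetime",
    title="timeseries-forecast-uncertainty · python · bokeh · anyplot.ai",
)

p.title.text_font_size = "18pt"

p.xaxis.axis_label = "Date"
p.yaxis.axis_label = "Sales (thousands)"
p.xaxis.axis_label_text_font_size = "14pt"
p.yaxis.axis_label_text_font_size = "14pt"

p.xaxis.major_label_text_font_size = "12pt"
p.yaxis.major_label_text_font_size = "12pt"
```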

claude Bot commented May 19, 2026

AI Review - Attempt 1/3

Image Description

Light render (plot-light.png): The plot has a warm off-white #FAF8F1 background. The title "timeseries-forecast-uncertainty · python · bokeh · anyplot.ai" appears in dark #1A1A17 text at the top-left and is fully readable. The Y-axis label "Sales (thousands)" and tick values (80, 100, 120, 140, 160) are clearly visible. X-axis date tick labels appear at the bottom in INK_SOFT gray and are readable. A solid green (#009E73) line traces 36 months of historical sales with clear trend and seasonal variation. A dashed orange (#D55E00) line shows the 12-month forecast starting at the right-hand vertical dashed separator. Two nested reddish-purple (#CC79A7) confidence bands surround the forecast — the 80% band at alpha=0.30 is clearly visible, while the 95% outer band at alpha=0.15 is lighter but distinguishable. The legend sits in the top-left corner with a warm elevated background. Y-axis-only grid lines are very subtle (alpha=0.10). All text is readable against the light background. Legibility verdict: PASS.

Dark render (plot-dark.png): The background is near-black #1A1A17 as expected. The title, Y-axis label, and tick values all render in light-colored text (INK / INK_SOFT) and are clearly legible against the dark surface — no dark-on-dark failures detected. The historical green line (#009E73) is identical in color to the light render. The dashed orange forecast line is also identical. The confidence bands retain their reddish-purple hue but appear more saturated/muted against the dark background; the 95% outer band at alpha=0.15 is noticeably faint but still visible. Legend background uses #242420 (ELEVATED_BG) with light label text — readable. The vertical dashed separator is visible. No dark-on-dark failures for labels or titles. Legibility verdict: PASS.

Both paragraphs are required. A review that only describes one render is invalid.

Score: 84/100

| Category | Score | Max |
|---|---|---|
| Visual Quality | 26 | 30 |
| Design Excellence | 11 | 20 |
| Spec Compliance | 15 | 15 |
| Data Quality | 15 | 15 |
| Code Quality | 10 | 10 |
| Library Mastery | 7 | 10 |
| Total | 84 | 100 |

Visual Quality (26/30)

  • VQ-01: Text Legibility (6/8) — All font sizes explicitly set per style guide. Title, axis labels, y-axis ticks clearly readable in both themes. X-axis date labels appear present but are faint at preview resolution; no overflow or clipping detected.
  • VQ-02: No Overlap (6/6) — No overlapping elements. Legend top-left clears all data elements.
  • VQ-03: Element Visibility (5/6) — Historical and forecast lines well-visible at line_width=3. The 95% CI outer band at fill_alpha=0.15 is quite faint in both themes, especially dark.
  • VQ-04: Color Accessibility (2/2) — CVD-safe Okabe-Ito palette; green vs orange distinction doesn't rely on red-green hue alone.
  • VQ-05: Layout & Canvas (3/4) — Good canvas utilization. y_range.end=175 with data maxing ~165 leaves noticeable empty space in the upper third of the plot.
  • VQ-06: Axis Labels & Title (2/2) — Y-axis "Sales (thousands)" has units; X-axis "Date" is acceptable for a datetime index.
  • VQ-07: Palette Compliance (2/2) — First series (historical) is #009E73 ✓; forecast uses Okabe-Ito pos 2 #D55E00 ✓; CI bands use pos 4 #CC79A7 ✓; backgrounds #FAF8F1/#1A1A17 ✓; both renders theme-correct.

Design Excellence (11/20)

  • DE-01: Aesthetic Sophistication (4/8) — Looks like a well-configured style-guide-compliant default. Professional and clean, but the CI bands use reddish-purple (#CC79A7) while the forecast line is orange (#D55E00), creating a color-family disconnect that reduces visual harmony.
  • DE-02: Visual Refinement (4/6) — X-grid disabled, Y-grid at alpha=0.10 (very subtle), L-shaped frame via outline_line_color=None, theme-adaptive legend and chrome. Clearly above defaults.
  • DE-03: Data Storytelling (3/6) — Solid/dashed line contrast and vertical separator create a reasonable historical-vs-forecast narrative. However, the mismatch between the orange forecast line and the purple CI bands weakens the visual story that "uncertainty belongs to this forecast."

Spec Compliance (15/15)

  • SC-01: Plot Type (5/5) — Correct: time series with forecast projection and confidence bands.
  • SC-02: Required Features (4/4) — Solid historical line, dashed forecast, vertical forecast-start line, nested 80%+95% CI bands, semi-transparent fills (alpha 0.15–0.30), connection line from last historical to first forecast point, full legend.
  • SC-03: Data Mapping (3/3) — Datetime X-axis; sales Y-axis; all data within y_range.
  • SC-04: Title & Legend (3/3) — Title is exactly timeseries-forecast-uncertainty · python · bokeh · anyplot.ai. Legend labels: "Historical Data", "Forecast", "80% Confidence Interval", "95% Confidence Interval" — all correct.

Data Quality (15/15)

  • DQ-01: Feature Coverage (6/6) — Shows full feature set: trend + seasonality in historical, growing forecast uncertainty, both 80% and 95% CI bands demonstrating nested confidence levels.
  • DQ-02: Realistic Context (5/5) — Monthly product sales with 3-year history and 12-month ARIMA/Prophet-style forecast. Neutral, realistic business scenario.
  • DQ-03: Appropriate Scale (4/4) — Sales values 80–165 thousand units are realistic; uncertainty growing proportionally over time is physically correct.

Code Quality (10/10)

  • CQ-01: KISS Structure (3/3) — Clean linear flow: imports → theme tokens → data generation → figure → styling → glyphs → legend → save.
  • CQ-02: Reproducibility (2/2) — np.random.seed(42).
  • CQ-03: Clean Imports (2/2) — All imported symbols are used.
  • CQ-04: Code Elegance (2/2) — Clean Pythonic code; connection line between history and forecast end-points is an elegant touch.
  • CQ-05: Output & API (1/1) — Saves plot-{THEME}.html and plot-{THEME}.png; current Bokeh/Selenium API.

Library Mastery (7/10)

  • LM-01: Idiomatic Usage (4/5) — ColumnDataSource used throughout, Span model for vertical reference line, explicit Legend with add_layout(), HoverTool with formatters — all idiomatic Bokeh patterns.
  • LM-02: Distinctive Features (3/5) — HoverTool with CI columns exposed in tooltip is distinctly Bokeh-interactive; Span for forecast boundary is an elegant Bokeh-native solution. HTML export + Selenium screenshot follows the required Bokeh pattern.

Score Caps Applied

  • None — DE-01=4 > 2, so "correct but boring" cap does not apply.

Strengths

  • Perfect spec compliance: all required features (historical line, dashed forecast, vertical separator, nested 80%/95% CI bands, connection line) are present and working.
  • Proper theme adaptation throughout — backgrounds, text, grid, legend all use the correct tokens for both light and dark renders. No dark-on-dark failures.
  • Idiomatic Bokeh usage: Span, ColumnDataSource, HoverTool with CI data, and Legend via add_layout() are all library-correct patterns.
  • Clean, reproducible code with np.random.seed(42) and a linear KISS structure.
  • Excellent data quality: realistic sales scenario with trend, seasonality, and growing uncertainty.

Weaknesses

  • CI bands use #CC79A7 (Okabe-Ito pos 4, reddish-purple) while the forecast line uses #D55E00 (Okabe-Ito pos 2, orange). The CI bands should use the same color family as the forecast: change CI bands to #D55E00 with fill_alpha=0.25 (80%) and fill_alpha=0.12 (95%), which creates the visual message that "this uncertainty belongs to this forecast."
  • The 95% CI outer band at fill_alpha=0.15 is too faint, especially in dark theme. With the orange color suggestion above, 0.12 outer and 0.25 inner would still be distinguishable.
  • y_range.end=175 leaves ~10 units of empty space above the forecast peak (~165). Reduce to y_range.end=168 or let Bokeh auto-range to reduce vertical whitespace.
  • Missing p.xaxis.major_tick_line_color = INK_SOFT and p.yaxis.major_tick_line_color = INK_SOFT — tick mark lines may default to black in dark theme (dark-on-dark minor issue).

Issues Found

  1. DE-01/DE-03 COLOR MISMATCH: CI bands are purple (#CC79A7) while forecast line is orange (#D55E00). These belong to different Okabe-Ito families and break the "forecast + its uncertainty" visual grouping.
    • Fix: Change band_95 and band_80 fill_color to OKABE_ITO_2 (#D55E00) with fill_alpha=0.12 and 0.25 respectively.
  2. VQ-05 EMPTY SPACE: y_range.end=175 is 10–15 units above the 95% CI upper bound, creating wasted canvas space.
    • Fix: Set p.y_range.start = 60 and p.y_range.end = 170 (or remove the explicit range and let Bokeh auto-range with padding).
  3. MINOR CHROME GAP: Tick mark lines not explicitly colored with INK_SOFT.
    • Fix: Add p.xaxis.major_tick_line_color = INK_SOFT and p.yaxis.major_tick_line_color = INK_SOFT.
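The three fixes above could be sketched as follows. This is a sketch under stated assumptions: the review does not show the PR's code, so the data is made up, the `INK_SOFT` value is a placeholder, and `varea` is used here for brevity in place of whatever glyph the implementation uses for `band_95` and `band_80`.

```python
from datetime import datetime

from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

OKABE_ITO_2 = "#D55E00"  # forecast colour named in the review
INK_SOFT = "#4A4A44"     # placeholder: the real token value is not given in the review

# Hypothetical forecast data with nested 80% and 95% bounds.
source = ColumnDataSource(data={
    "x": [datetime(2024, 1, 1), datetime(2024, 2, 1), datetime(2024, 3, 1)],
    "y": [130.0, 133.0, 137.0],
    "lower_80": [126.0, 127.0, 128.0],
    "upper_80": [134.0, 139.0, 146.0],
    "lower_95": [122.0, 121.0, 120.0],
    "upper_95": [138.0, 145.0, 154.0],
})

p = figure(x_axis_type="datetime", width=800, height=450)

# Fix 1: bands share the forecast colour; the outer band is drawn first and fainter.
band_95 = p.varea(x="x", y1="lower_95", y2="upper_95", source=source,
                  fill_color=OKABE_ITO_2, fill_alpha=0.12)
band_80 = p.varea(x="x", y1="lower_80", y2="upper_80", source=source,
                  fill_color=OKABE_ITO_2, fill_alpha=0.25)
p.line("x", "y", source=source, line_color=OKABE_ITO_2,
       line_width=3, line_dash="dashed")

# Fix 2: explicit y-range with less headroom.
p.y_range.start = 60
p.y_range.end = 170

# Fix 3: theme-adaptive tick marks on both axes.
p.xaxis.major_tick_line_color = INK_SOFT
p.yaxis.major_tick_line_color = INK_SOFT
```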

AI Feedback for Next Attempt

The primary fix needed is color coherence: change the confidence-band fill color from #CC79A7 (purple) to #D55E00 (orange/vermillion, same as the forecast line) with fill_alpha=0.25 (80% CI) and fill_alpha=0.12 (95% CI). This one change dramatically improves both DE-01 (color harmony) and DE-03 (storytelling — the bands visually "belong" to the forecast line). Also reduce y_range.end from 175 to ~170 to eliminate the empty upper space. Add p.xaxis.major_tick_line_color = INK_SOFT and p.yaxis.major_tick_line_color = INK_SOFT for complete theme-adaptive chrome.

Verdict: REJECTED

github-actions Bot added the quality:84 (Quality score 84/100) and ai-rejected (Quality not OK, triggers update) labels May 19, 2026
github-actions Bot added the ai-attempt-1 (First repair attempt) label and removed the ai-rejected (Quality not OK, triggers update) label May 19, 2026
@github-actions

🔧 Repair Attempt 1/4

Applied fixes based on AI review feedback.

Status: Repair completed, re-triggering review...


🤖 impl-repair


claude Bot commented May 19, 2026

AI Review - Attempt 2/3

Image Description

Light render (plot-light.png): Background is warm off-white (#FAF8F1) as required. Title "timeseries-forecast-uncertainty · python · bokeh · anyplot.ai" renders in dark ink and is clearly readable against the light surface. The historical data line is #009E73 (Okabe-Ito position 1, brand green) as a solid 3px line spanning 36 months of monthly sales data (~80–125 thousands range). The forecast line uses #D55E00 (OI position 2, vermillion) as a dashed line extending 12 months beyond the historical period. Two nested confidence bands fill the forecast region using #CC79A7 (OI position 4) at fill_alpha=0.30 (80% CI, visibly darker) and fill_alpha=0.15 (95% CI, lighter). A dashed vertical line marks the forecast start. Legend in top-left with four items is readable. Y-axis tick labels (80, 100, 120, 140, 160) are visible. X-axis date labels are present but small. The CI bands fill extends to the right canvas edge with minimal whitespace. The upper y-axis region (~145–175) is largely empty, creating wasted vertical canvas space. All text is readable against the light background.

Dark render (plot-dark.png): Background is warm near-black (#1A1A17) as required. Title, y-axis label "Sales (thousands)", and all tick labels render in light-colored ink (INK_SOFT token correctly applied). The #009E73 historical data line is identical to the light render — data colors are consistent between themes as required. The forecast dashed line in #D55E00 remains visible. The CI bands take on a darker brownish-orange appearance due to alpha blending against the near-black background — this is expected and both bands remain visually distinguishable from each other. The legend is rendered with ELEVATED_BG (#242420) fill and INK_SOFT label text — readable. No "dark on dark" failures observed: all text elements are clearly light-colored against the dark surface. The plot background area is visibly darker than the border fill, which is also #1A1A17, so the boundary is subtle but correct.

Both paragraphs are required. A review that only describes one render is invalid.
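The alpha-blending remark in the dark-render paragraph can be checked with a little arithmetic. The sketch below applies the standard "source over" formula to an opaque background. The fill colour #D55E00 and alpha 0.25 are the values proposed in the first review, and it is an assumption that the repaired plot uses exactly these.

```python
def hex_to_rgb(color: str) -> tuple[int, int, int]:
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def blend(fill: str, background: str, alpha: float) -> str:
    """Composite a translucent fill over an opaque background (source-over)."""
    fg, bg = hex_to_rgb(fill), hex_to_rgb(background)
    # Each channel is a weighted average of fill and background.
    mixed = (int(alpha * f + (1 - alpha) * b + 0.5) for f, b in zip(fg, bg))
    return "#" + "".join(f"{channel:02X}" for channel in mixed)


light = blend("#D55E00", "#FAF8F1", 0.25)  # 80% band over the light background
dark = blend("#D55E00", "#1A1A17", 0.25)   # same band over the dark background
print(light, dark)
```

With these inputs the band works out to #F1D2B5 (a peach tone) on the light background and #492B11 (a dark brown) on the dark one, which is consistent with the "peach-toned" and "brownish-orange" descriptions in this review.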

Score: 88/100

| Category | Score | Max |
|---|---|---|
| Visual Quality | 27 | 30 |
| Design Excellence | 13 | 20 |
| Spec Compliance | 15 | 15 |
| Data Quality | 15 | 15 |
| Code Quality | 10 | 10 |
| Library Mastery | 8 | 10 |
| Total | 88 | 100 |

Visual Quality (27/30)

  • VQ-01: Text Legibility (7/8) — All font sizes explicitly set (title 18pt, axis labels 14pt, tick labels 12pt, legend 12pt). Proportions are good and readable at desktop scale. Minor deduction: x-axis datetime tick labels appear sparse and could be tight at mobile scale (~400px).
  • VQ-02: No Overlap (6/6) — No overlapping text or data collisions. Legend sits cleanly in the upper-left without intersecting the historical line.
  • VQ-03: Element Visibility (5/6) — Lines and bands are visible. Minor issue: the dashed forecast line (#D55E00, 3px) is partially obscured within the confidence band fill, especially in the light render where the peach-toned fill reduces contrast on the thin dashed line.
  • VQ-04: Color Accessibility (2/2) — Okabe-Ito palette is colorblind-safe. Green/orange combination provides adequate luminance contrast.
  • VQ-05: Layout & Canvas (3/4) — Plot fills the canvas well horizontally. Deduction: y-range set to 55–175 while actual data range is ~75–165, leaving ~30% of the vertical canvas as empty whitespace above the grid. CI bands also reach the right canvas edge without padding, giving a slightly clipped appearance on the right side.
  • VQ-06: Axis Labels & Title (2/2) — X-axis: "Date"; Y-axis: "Sales (thousands)" with units. Title is correctly formatted.
  • VQ-07: Palette Compliance (2/2) — First series (historical) = #009E73 ✓. Second series (forecast) = #D55E00 ✓. CI bands use OI position 4 (#CC79A7), skipping position 3 — intentional design grouping the bands visually away from the line series. Background #FAF8F1/#1A1A17 correct. Chrome (text, grid, legend) is fully theme-adaptive in both renders.

Design Excellence (13/20)

  • DE-01: Aesthetic Sophistication (5/8) — Above the "well-configured default" baseline. Thoughtful choices: distinct color assignment for historical vs. forecast, nested CI bands with appropriate alpha layering, clean L-shaped spine removal, theme-adaptive tokens throughout. Not quite publication-ready (no typographic hierarchy variation, no focal-point emphasis) but clearly above defaults.
  • DE-02: Visual Refinement (4/6) — Y-axis-only grid at 10% opacity is well-calibrated. Top/right spines removed via outline_line_color = None. Legend styled with elevated background and border. Deduction: wasted vertical whitespace (y-range too wide), and no explicit major_tick_line_color on axes (library defaults used for tick marks).
  • DE-03: Data Storytelling (4/6) — The plot communicates the forecast narrative clearly: vertical divider marks the regime change, growing CI bands convey increasing uncertainty over time, green→orange color shift distinguishes observed from projected. A viewer immediately understands the structure without reading the legend. Slight room for improvement: no annotation calling out the forecast start date or peak forecast value.

Spec Compliance (15/15)

  • SC-01: Plot Type (5/5) — Correct: time series with historical period, forecast projection, and nested 80%/95% confidence bands.
  • SC-02: Required Features (4/4) — All spec features present: solid historical line, dashed forecast line, vertical forecast-start marker, 80% CI band, 95% CI band, legend.
  • SC-03: Data Mapping (3/3) — Date on x-axis, Sales (thousands) on y-axis. All 48 data points (36 historical + 12 forecast) correctly mapped.
  • SC-04: Title & Legend (3/3) — Title: "timeseries-forecast-uncertainty · python · bokeh · anyplot.ai" ✓. Legend: "Historical Data", "Forecast", "80% Confidence Interval", "95% Confidence Interval" — all present and correct.

Data Quality (15/15)

  • DQ-01: Feature Coverage (6/6) — Shows all aspects: upward trend, seasonal oscillation in history, growing uncertainty width over forecast horizon, two CI levels, smooth handoff at forecast boundary.
  • DQ-02: Realistic Context (5/5) — Monthly product sales scenario with 3-year history and 12-month forecast is a real, neutral, business-relevant use case.
  • DQ-03: Appropriate Scale (4/4) — Sales ranging 80–140 thousand with ±5–25k uncertainty growth is proportionally realistic for a mid-sized product line. Seasonal amplitude of ~15k is plausible.

Code Quality (10/10)

  • CQ-01: KISS Structure (3/3) — Flat: theme tokens → data generation → figure + styling → glyphs → legend → save. No functions or classes.
  • CQ-02: Reproducibility (2/2) — np.random.seed(42) set.
  • CQ-03: Clean Imports (2/2) — All imports used: ColumnDataSource, HoverTool, Legend, Span are all referenced.
  • CQ-04: Code Elegance (2/2) — Idiomatic patch-polygon approach for CI bands. HoverTool configuration for both series lines is clean and well-parameterized.
  • CQ-05: Output & API (1/1) — Saves plot-{THEME}.html + plot-{THEME}.png via Selenium screenshot pattern. Current Bokeh 3.x API.

Library Mastery (8/10)

  • LM-01: Idiomatic Usage (5/5) — Correct use of ColumnDataSource for all renderers, Span for vertical line, manual Legend with explicit items, HoverTool with formatters={"@x": "datetime"} and mode="vline". Patch-polygon technique for CI bands is the correct Bokeh idiom.
  • LM-02: Distinctive Features (3/5) — Two HoverTools with CI data formatted for the forecast line is a Bokeh-distinctive capability. HTML export as a required artifact leverages Bokeh's interactivity advantage. Falls short of 5: no CustomJS callbacks, no Band model (Bokeh has a native Band annotation that would be more idiomatic than the patch-polygon approach for uncertainty bands).
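For context, the "patch-polygon" technique mentioned above usually means tracing the upper bound left to right and then the lower bound right to left, so the two edges close into one filled shape. The helper below is a hypothetical illustration of that idea, not the function used in this PR.

```python
def band_polygon(xs, lower, upper):
    """Return (poly_x, poly_y) outlining the region between lower and upper.

    The outline follows the upper edge forwards, then the lower edge backwards.
    """
    xs, lower, upper = list(xs), list(lower), list(upper)
    poly_x = xs + xs[::-1]
    poly_y = upper + lower[::-1]
    return poly_x, poly_y


poly_x, poly_y = band_polygon([0, 1, 2], lower=[1, 1, 1], upper=[2, 3, 4])
print(poly_x)  # [0, 1, 2, 2, 1, 0]
print(poly_y)  # [2, 3, 4, 1, 1, 1]
```

The resulting coordinate lists can be passed to a polygon glyph such as Bokeh's `figure.patch(poly_x, poly_y, fill_alpha=...)`.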

Score Caps Applied

  • None applied. DE-01=5 (>2) and DE-02=4 (>2), so the "correct but boring" cap does not trigger.

Strengths

  • Full spec compliance — all required features (historical line, forecast, 80%/95% CI bands, vertical divider, legend) implemented correctly
  • Perfect data quality — realistic monthly sales scenario with plausible trend, seasonality, and growing forecast uncertainty
  • Proper Bokeh idioms: ColumnDataSource throughout, Span for the divider, dual HoverTools with datetime formatters and CI data
  • Theme-adaptive chrome implemented correctly in both renders — no dark-on-dark failures
  • Nested CI band layering (95% drawn first, 80% on top) creates correct visual nesting

Weaknesses

  • Y-axis range (55–175) is wider than needed for the data range (~75–165), leaving ~30% of the vertical canvas as empty whitespace; tighten to ~70–170 or compute dynamically from min(lower_95) - 10
  • CI bands reach the right canvas edge with no x-axis padding, giving a slightly clipped appearance; add a small x-range buffer after the last forecast date
  • Dashed forecast line (3px, #D55E00) becomes hard to see inside the confidence band fill — increase to line_width=4 or use a lighter CI band alpha to preserve contrast
  • major_tick_line_color not explicitly set for either axis; in dark mode this may fall back to library defaults — set p.xaxis.major_tick_line_color = INK_SOFT and p.yaxis.major_tick_line_color = INK_SOFT
  • Bokeh's native Band annotation model would be more idiomatic than the patch-polygon workaround for CI regions

Issues Found

  1. VQ-05 MINOR: Y-axis range too wide (55–175 vs data range ~75–165), creating ~30% wasted vertical space.
    • Fix: Set p.y_range.start = max(0, np.min(lower_95) - 10) and p.y_range.end = np.max(upper_95) + 10
  2. VQ-05 MINOR: No x-axis padding beyond the last forecast date — CI bands appear to clip at the right canvas edge.
    • Fix: Add p.x_range.end = dates_forecast[-1] + pd.DateOffset(months=1) after figure creation
  3. VQ-03 MINOR: Forecast dashed line partially obscured by CI band fills.
    • Fix: Increase forecast line to line_width=4 for better contrast within the bands
  4. LM-02: Patch-polygon CI bands could use Bokeh's native Band model for cleaner code and better interactivity.
    • Fix: from bokeh.models import Band; Band(base="x", lower="lower_80", upper="upper_80", source=source, ...)
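Taken together, the four fixes might look like the sketch below. The forecast values are synthetic and the variable names (`dates_forecast`, `lower_95`, `upper_95`) follow the review's wording, but the surrounding script is an assumption. Note that `Band` is an annotation and not a glyph renderer, so it is not picked up by automatic data ranging, which is one reason to set the ranges explicitly here.

```python
import numpy as np
import pandas as pd
from bokeh.models import Band, ColumnDataSource
from bokeh.plotting import figure

# Synthetic 12-month forecast with uncertainty that widens over the horizon.
dates_forecast = pd.date_range("2024-01-01", periods=12, freq="MS")
center = np.linspace(125, 150, 12)
half_width = np.linspace(3, 10, 12)
lower_95, upper_95 = center - half_width, center + half_width

source = ColumnDataSource(data={
    "x": list(dates_forecast),
    "y": center,
    "lower_95": lower_95,
    "upper_95": upper_95,
})

p = figure(x_axis_type="datetime", width=800, height=450)

# Issue 3: a wider dashed line stays visible inside the band fill.
p.line("x", "y", source=source, line_color="#D55E00",
       line_width=4, line_dash="dashed")

# Issue 4: native Band annotation in place of a hand-built polygon.
band = Band(base="x", lower="lower_95", upper="upper_95", source=source,
            fill_color="#D55E00", fill_alpha=0.12, line_color=None)
p.add_layout(band)

# Issue 1: y-range derived from the outer band plus a 10-unit buffer.
p.y_range.start = max(0, float(np.min(lower_95)) - 10)
p.y_range.end = float(np.max(upper_95)) + 10

# Issue 2: one month of padding after the last forecast date.
p.x_range.end = (dates_forecast[-1] + pd.DateOffset(months=1)).to_pydatetime()
```

One trade-off to weigh before switching: legend entries in Bokeh are built from glyph renderers, so a `Band` may need a separate proxy glyph to keep the "80% Confidence Interval" and "95% Confidence Interval" legend items.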

AI Feedback for Next Attempt

Focus on layout and visibility refinements: (1) tighten the y-axis range to fit the actual data plus a small buffer to eliminate wasted vertical space; (2) add a small x-axis right-side buffer so CI bands don't clip at the canvas edge; (3) increase forecast line width to 4px to stay visible within the CI fills; (4) set major_tick_line_color on both axes to INK_SOFT. Consider replacing patch-polygon CI bands with Bokeh's native Band annotation model for cleaner code. These changes should push the score from 88 to 92+.

Verdict: APPROVED

github-actions Bot added the quality:88 (Quality score: 88/100) and ai-approved (Quality OK, ready for merge) labels and removed the quality:84 (Quality score 84/100) label May 19, 2026
MarkusNeusinger merged commit c357263 into main May 19, 2026
MarkusNeusinger deleted the implementation/timeseries-forecast-uncertainty/bokeh branch May 19, 2026 13:42

Labels

  • ai-approved: Quality OK, ready for merge
  • ai-attempt-1: First repair attempt
  • quality:88: Quality score: 88/100
